Text File | 1991-08-04 | 38KB | 961 lines
GAWK(1) Utility Commands GAWK(1)
NAME
gawk - pattern scanning and processing language
SYNOPSIS
gawk [ -W gawk-options ] [ -Ffs ] [ -v var=val ] -f
program-file [ -- ] file ...
gawk [ -W gawk-options ] [ -Ffs ] [ -v var=val ] [ -- ]
program-text file ...
DESCRIPTION
Gawk is the GNU Project's implementation of the AWK programming
language. It conforms to the definition of the language in the
POSIX P1003.2 Command Language And Utilities Standard (draft 11).
This version in turn is based on the description in The AWK
Programming Language, by Aho, Kernighan, and Weinberger, with the
additional features defined in the System V Release 4 version of
UNIX awk. Gawk also provides some GNU-specific extensions.
The command line consists of options to gawk itself, the AWK
program text (if not supplied via the -f option), and values
to be made available in the ARGC and ARGV pre-defined AWK
variables.
Gawk accepts the following options, which should be available
on any implementation of the AWK language.
-Ffs Use fs for the input field separator (the value of the
FS predefined variable).
-v var=val
Assign the value val to the variable var before execution
of the program begins. Such variable values are
available to the BEGIN block of an AWK program.
-f program-file
Read the AWK program source from the file program-file,
instead of from the first command line argument.
Multiple -f options may be used.
-- Signal the end of options. This is useful to allow
further arguments to the AWK program itself to start
with a ``-''. This is mainly for consistency with the
argument parsing convention used by most other POSIX
programs.
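As a minimal sketch of the -F and -v options described above, the following assumes a POSIX-style awk is installed under the name awk (gawk should behave the same way). The sample input line and the variable name n are made up for illustration.

```shell
# -F: sets the field separator FS to ":".
# -v n=3 assigns n before the program starts, so the BEGIN block can see it.
out=$(printf 'root:x:0:0\n' | awk -F: -v n=3 'BEGIN { print "n is " n } { print $n }')
echo "$out"
```

The BEGIN block prints the value given with -v, and the main rule then prints the third colon-separated field of the input line.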
Following the POSIX standard, gawk-specific options are supplied
via arguments to the -W option. Multiple -W options
may be supplied, or multiple arguments may be supplied
together if they are separated by commas, or enclosed in
quotes and separated by white space. Case is ignored in
arguments to the -W option. Gawk recognizes the following
arguments to -W:
compat Run in compatibility mode. In compatibility mode,
gawk behaves identically to UNIX awk; none of the
GNU-specific extensions are recognized.
copyleft
copyright Print the short version of the GNU copyright
information message on the error output.
version Print version information for this particular copy
of gawk on the error output. This is useful
mainly for knowing if the current copy of gawk on
your system is up to date with respect to whatever
the Free Software Foundation is distributing.
posix This turns on compatibility mode, with the following
additional restrictions:
o \x escape sequences are not recognized.
o The synonym func for the keyword function is
not recognized.
o The operators ** and **= cannot be used in
place of ^ and ^=.
lint Provide warnings about constructs that are dubious
or non-portable to other AWK implementations.
Any other options are flagged as illegal, but are otherwise
ignored.
AWK PROGRAM EXECUTION
An AWK program consists of a sequence of pattern-action
statements and optional function definitions.
pattern { action statements }
function name(parameter list) { statements }
Gawk first reads the program source from the program-file(s)
if specified, or from the first non-option argument on the
command line. The -f option may be used multiple times on
the command line. Gawk will read the program text as if all
the program-files had been concatenated together. This is
useful for building libraries of AWK functions, without having
to include them in each new AWK program that uses them.
To use a library function in a file from a program typed in
on the command line, specify /dev/tty as one of the
program-files, type your program, and end it with a ^D
(control-d).
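The library mechanism described above might be used as in the following sketch. The file names lib.awk and main.awk and the function max are hypothetical, and the command awk is assumed to be a POSIX-style awk such as gawk. Both files are named with a ``/'' in their path, so no AWKPATH search is involved.

```shell
dir=$(mktemp -d)

# lib.awk: a small reusable function (hypothetical example).
cat > "$dir/lib.awk" <<'EOF'
function max(a, b) { return (a + 0 > b + 0) ? a : b }
EOF

# main.awk: calls max() without defining it.
cat > "$dir/main.awk" <<'EOF'
{ print max($1, $2) }
EOF

# The two -f files are read as if they had been concatenated.
out=$(printf '3 7\n10 2\n' | awk -f "$dir/lib.awk" -f "$dir/main.awk")
echo "$out"
rm -rf "$dir"
```

For each input line, the program prints the larger of the first two fields.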
The environment variable AWKPATH specifies a search path to
use when finding source files named with the -f option. If
this variable does not exist, the default path is
".:/usr/lib/awk:/usr/local/lib/awk". If a file name given
to the -f option contains a ``/'' character, no path search
is performed.
Gawk executes AWK programs in the following order. First,
gawk compiles the program into an internal form. Next, all
variable assignments specified via the -v option are performed.
Then, gawk executes the code in the BEGIN block(s)
(if any), and then proceeds to read each file named in the
ARGV array. If there are no files named on the command
line, gawk reads the standard input.
If a ``file'' named on the command line has the form var=val
it is treated as a variable assignment. The variable var
will be assigned the value val. This is most useful for
dynamically assigning values to the variables AWK uses to
control how input is broken into fields and records. It is
also useful for controlling state if multiple passes are
needed over a single data file.
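A short sketch of command line variable assignment, assuming a POSIX-style awk installed as awk. The data files colon.txt and comma.txt are hypothetical and are created only for this illustration. Each FS=... argument takes effect when it is reached in the argument list, before the file that follows it is read.

```shell
dir=$(mktemp -d)
printf 'a:b\n' > "$dir/colon.txt"
printf 'c,d\n' > "$dir/comma.txt"

# FS is changed between the two files, so each is split on its own separator.
out=$(awk '{ print $2 }' FS=: "$dir/colon.txt" FS=, "$dir/comma.txt")
echo "$out"
rm -rf "$dir"
```

The same technique can set a state variable (for example, a pass counter) before each reading of a single data file.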
If the value of a particular element of ARGV is empty (""),
gawk skips over it.
For each line in the input, gawk tests to see if it matches
any pattern in the AWK program. For each pattern that the
line matches, the associated action is executed. The patterns
are tested in the order they occur in the program.
Finally, after all the input is exhausted, gawk executes the
code in the END block(s) (if any).
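The order of execution described above can be observed with a small program such as the following sketch. It assumes a POSIX-style awk installed as awk; the input words and the output labels are invented for the example.

```shell
out=$(printf 'apple\nbanana\nant\n' | awk '
BEGIN { print "start" }          # runs before any input is read
/^a/  { print "a-word: " $0 }    # tested first for every line
/an/  { print "has an: " $0 }    # tested second for every line
END   { print "lines: " NR }     # runs after all input is exhausted
')
echo "$out"
```

The line "ant" matches both patterns, so both actions run for it, in program order.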
VARIABLES AND FIELDS
AWK variables are dynamic; they come into existence when
they are first used. Their values are either floating-point
numbers or strings, or both, depending upon how they are
used. AWK also has one-dimensional arrays; multiply dimensioned
arrays may be simulated. There are several pre-defined
variables that AWK sets as a program runs; these
will be described as needed and summarized below.
Fields
As each input line is read, gawk splits the line into
fields, using the value of the FS variable as the field
separator. If FS is a single character, fields are
separated by that character. Otherwise, FS is expected to
be a full regular expression. In the special case that FS
is a single blank, fields are separated by runs of blanks
and/or tabs. Note that the value of IGNORECASE (see below)
will also affect how fields are split when FS is a regular
expression.
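The three cases for FS described above might look like this in practice. This is a sketch that assumes a POSIX-style awk installed as awk; the input strings and the shell variable names one, two, and three are made up.

```shell
# Single character: fields are separated by that character.
one=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":" } { print $2 }')

# Longer value: FS is treated as a regular expression (here, a run of digits).
two=$(printf 'x12y345z\n' | awk 'BEGIN { FS = "[0-9]+" } { print $2 }')

# Single blank: fields are separated by runs of blanks and/or tabs.
three=$(printf '  p \t q   r\n' | awk 'BEGIN { FS = " " } { print NF ":" $2 }')

echo "$one $two $three"
```

In the third case the line is split into three fields even though the separators are of mixed kind and length.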
If the FIELDWIDTHS variable is set to a space separated list
of numbers, each field is expected to have fixed width, and
gawk will split up the record using the specified widths.
The value of FS is ignored.
Each field in the input line may be referenced by its position,
$1, $2, and so on. $0 is the whole line. A field may also
be assigned a value. Fields need not be
referenced by constants:
n = 5
print $n
prints the fifth field in the input line. The variable NF
is set to the total number of fields in the input line.
References to non-existent fields (i.e., fields after $NF)
produce the null string. However, assigning to a non-existent
field (e.g., $(NF+2) = 5) will increase the value
of NF, create any intervening fields with the null string as
their value, and cause the value of $0 to be recomputed,
with the fields being sepa